Validity and social dimension
Validity theory has begun to develop ways of thinking about the social dimensions of the use of tests. Contemporary discussions of validity in educational assessment are heavily influenced by the thinking of the American psychologist Lee Cronbach. He was part of the American Psychological Association’s Committee on Psychological Tests, which met between 1950 and 1954 to develop criteria of test quality. Its recommendations became the precursor of the present-day Standards for Educational and Psychological Testing, and during the committee’s work, a subcommittee with Meehl and Challman coined the term “construct validity.” Cronbach and Meehl (1955) explicated the concept in detail in their classic article in the Psychological Bulletin.
Although criterion-related validity (also known as predictive or concurrent validity) was the standard approach at the time, there was increasing dissatisfaction with its shortcomings (Smith, 2005), and Cronbach and Meehl framed their new concept of construct validity as an alternative to criterion-related validity: “Construct validity is ordinarily studied when the tester has no definite criterion measure of the quality with which he is concerned and must use indirect measures. Here the trait or quality underlying the test is of central importance, rather than either the test behavior or the scores on the criteria” (Cronbach & Meehl, 1955). Note that the language of “trait” and “underlying quality” frames the target of validation in the language of individuality and cognition.
One of the major developments in validation research since Cronbach and Meehl’s article is the increasingly central role taken by construct validity, which has subsumed other types of validity and validation. There is also clear recognition that validity is not a mathematical property like discrimination or reliability, but a matter of judgment.
Cronbach (1989) emphasized the need for a validity argument, which focuses on collecting evidence for or against a certain interpretation of test scores. In other words, construct validation work is concerned with the validity of inferences rather than the validity of instruments. In fact, Cronbach argued that there is no such thing as a “valid test,” only more or less defensible interpretations: “One does not validate a test, but only a principle for making inferences” (Cronbach & Meehl, 1955). “One validates not a test, but an interpretation of data arising from a specified procedure” (Cronbach, 1971). Cronbach and Meehl (1955) distinguished between a weak and a strong program for construct validation. The weak one is a fairly haphazard collection of any sort of evidence (mostly correlational) that supports the particular interpretation to be validated. It is in fact a highly unprincipled attempt at verification by any means available. In contrast, the strong program is based on the falsification idea advanced by Popperian philosophy (Popper, 1962): Rival hypotheses for interpretations are proposed and logically or empirically examined.
In his most influential writings on validation within measurement, Cronbach never stressed the importance of the sociopolitical context and its influence on the whole testing enterprise; this is in marked contrast to his work on program evaluation (Cronbach et al., 1980), in which he strongly emphasized that evaluations are sites of political conflict and clashes of values. However, in his later writings, possibly through his experiences in program evaluation, Cronbach highlighted the role of beliefs and values in validity arguments, which “must link concepts, evidence, social and personal consequences, and values” (Cronbach, 1988). He acknowledged that all interpretation involves questions of values: “A persuasive defense of an interpretation will have to combine evidence, logic, and rhetoric. What is persuasive depends on the beliefs in the community” (Cronbach, 1989). He also concurred with Messick (1980) that validity work has an obligation to consider test consequences and to help prevent negative ones. Cronbach further recognized that judgments of positive or negative consequences depend on social views of what is a desirable consequence, and that these views and values change over time (Cronbach, 1990). What we have here, then, is a concern for social consequences as a kind of corrective to an earlier, entirely cognitive and individualistic way of thinking about tests.
The most influential current theory of validity of relevance to language testing remains that developed by Samuel Messick in his years at Educational Testing Service, Princeton, from the 1960s to the 1990s, most definitively set out in his much-cited 1989 article (Messick, 1989). Although this framework has been discussed in detail elsewhere (Bachman, 1990; McNamara, 2006) and is well known in language testing, it is important to discuss the issues that it raises in the context of the argument of the present volume. Messick incorporated a social dimension of assessment quite explicitly within his model, but, as with Cronbach, it is grafted (somewhat uneasily, as we will see) onto the tradition of the psychological dimensions of measurement that he inherited and expanded. Messick, like Cronbach, saw assessment as a process of reasoning and evidence gathering carried out in order for inferences to be made about individuals, and saw the task of establishing the meaningfulness and defensibility of those inferences as the primary task of assessment development and research. This reflects an individualist, psychological tradition of measurement concerned with fairness.
Messick introduced the social more explicitly into this picture by arguing two things: first, that our conceptions of what we are measuring, and the things we prioritize in measurement, will reflect values, which we can assume will be social and cultural in origin; and second, that tests have real effects in the educational and social contexts in which they are used, and that these effects need to be matters of concern for those responsible for the test. Messick saw these aspects of validity as holding together within a unified theory of validity, which he set out in the form of a matrix. Now we can consider how validity theorists following Messick, and informed by his thinking, have interpreted and elaborated his approach, including its social dimensions. We will begin by setting out Messick’s own thinking on construct validity in greater detail and then see how it has been subsequently interpreted by two leading theorists: Mislevy and Kane. The concept of construct validity has traditionally dominated discussion of test validation, and it is important to understand what the issues are and in what way they involve social dimensions of assessment in terms of a concern for fairness.
References
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics.
Cronbach, L. J. (1971). Validity. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 443–597). Washington, DC: American Council on Education.
Cronbach, L. J. (1988). Five perspectives on the validity argument. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 3–18). Hillsdale, NJ: Lawrence Erlbaum Associates.